Claude/unified query planner a w8ax by AdaWorldAPI · Pull Request #65 · AdaWorldAPI/ndarray

AdaWorldAPI · 2026-03-30T19:56:45Z

No description provided.

F32x8 and F64x4 only had Add + Mul. AVX2 fallback for F32x16 needs all four arithmetic ops on the 256-bit types. Additive only — no existing code changed. https://claude.ai/code/session_01BTATTRUACijvsK4hqmKUBR
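As a rough illustration of what adding Sub and Div to a 256-bit float type can look like, here is a minimal sketch. The `F32x8` below is a hypothetical array-backed stand-in, not the crate's actual type, and the helper module `avx` is invented for this example. It uses the real `_mm256_sub_ps` / `_mm256_div_ps` intrinsics when AVX is detected at runtime and falls back to a scalar loop otherwise.

```rust
use std::ops::{Div, Sub};

/// Hypothetical stand-in for the crate's `F32x8` (eight f32 lanes).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x8(pub [f32; 8]);

#[cfg(target_arch = "x86_64")]
mod avx {
    use std::arch::x86_64::*;

    /// Lane-wise `a - b` with a single 256-bit subtract.
    #[target_feature(enable = "avx")]
    pub unsafe fn sub(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
        let mut out = [0.0f32; 8];
        unsafe {
            let r = _mm256_sub_ps(_mm256_loadu_ps(a.as_ptr()), _mm256_loadu_ps(b.as_ptr()));
            _mm256_storeu_ps(out.as_mut_ptr(), r);
        }
        out
    }

    /// Lane-wise `a / b` with a single 256-bit divide.
    #[target_feature(enable = "avx")]
    pub unsafe fn div(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
        let mut out = [0.0f32; 8];
        unsafe {
            let r = _mm256_div_ps(_mm256_loadu_ps(a.as_ptr()), _mm256_loadu_ps(b.as_ptr()));
            _mm256_storeu_ps(out.as_mut_ptr(), r);
        }
        out
    }
}

impl Sub for F32x8 {
    type Output = F32x8;
    fn sub(self, rhs: F32x8) -> F32x8 {
        #[cfg(target_arch = "x86_64")]
        {
            if is_x86_feature_detected!("avx") {
                // SAFETY: AVX support was checked on the line above.
                return F32x8(unsafe { avx::sub(&self.0, &rhs.0) });
            }
        }
        // Portable scalar fallback (an assumption of this sketch).
        F32x8(std::array::from_fn(|i| self.0[i] - rhs.0[i]))
    }
}

impl Div for F32x8 {
    type Output = F32x8;
    fn div(self, rhs: F32x8) -> F32x8 {
        #[cfg(target_arch = "x86_64")]
        {
            if is_x86_feature_detected!("avx") {
                // SAFETY: AVX support was checked on the line above.
                return F32x8(unsafe { avx::div(&self.0, &rhs.0) });
            }
        }
        F32x8(std::array::from_fn(|i| self.0[i] / rhs.0[i]))
    }
}
```

With these impls, `F32x8([8.0; 8]) - F32x8([2.0; 8])` yields eight lanes of `6.0`, and the same operands under `/` yield eight lanes of `4.0`.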

F32x16, F64x8, U8x64, I32x16, I64x8, U32x16, U64x8 — all composed from 2× AVX2 halves (F32x8/F64x4 for float, array loops for integer). Same API as simd_avx512.rs types. simd.rs will LazyLock-dispatch between the two files based on runtime CPU detection. Add/Sub/Mul/Div on F32x16 dispatch to 2× F32x8 operations (AVX2). Integer types use array loops (AVX2 lacks 512-bit integer SIMD). https://claude.ai/code/session_01BTATTRUACijvsK4hqmKUBR
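The composition described above can be sketched as follows. This is an assumption-laden illustration rather than the contents of simd_avx2.rs: the field names `lo` / `hi`, the `from_array` / `to_array` helpers, the `impl_ops!` macro, and the choice of wrapping integer addition are all hypothetical, and the `F32x8` half is modelled with plain arrays so the example runs on any target.

```rust
use std::ops::{Add, Div, Mul, Sub};

/// Hypothetical stand-in for the 256-bit base type (scalar lanes in this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x8(pub [f32; 8]);

/// 16 f32 lanes stored as two 8-lane halves.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x16 {
    lo: F32x8,
    hi: F32x8,
}

impl F32x16 {
    pub fn from_array(a: [f32; 16]) -> Self {
        let (mut lo, mut hi) = ([0.0f32; 8], [0.0f32; 8]);
        lo.copy_from_slice(&a[..8]);
        hi.copy_from_slice(&a[8..]);
        Self { lo: F32x8(lo), hi: F32x8(hi) }
    }

    pub fn to_array(self) -> [f32; 16] {
        let mut out = [0.0f32; 16];
        out[..8].copy_from_slice(&self.lo.0);
        out[8..].copy_from_slice(&self.hi.0);
        out
    }
}

// Each operator on F32x16 forwards to the same operator on both halves.
macro_rules! impl_ops {
    ($($tr:ident $m:ident $op:tt),*) => {$(
        impl $tr for F32x8 {
            type Output = Self;
            fn $m(self, r: Self) -> Self {
                F32x8(std::array::from_fn(|i| self.0[i] $op r.0[i]))
            }
        }
        impl $tr for F32x16 {
            type Output = Self;
            fn $m(self, r: Self) -> Self {
                Self { lo: self.lo $op r.lo, hi: self.hi $op r.hi }
            }
        }
    )*};
}
impl_ops!(Add add +, Sub sub -, Mul mul *, Div div /);

/// Integer type backed by a plain array loop.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct I32x16(pub [i32; 16]);

impl Add for I32x16 {
    type Output = Self;
    fn add(self, r: Self) -> Self {
        // Wrapping on overflow is this sketch's assumption, chosen to mirror SIMD integer adds.
        I32x16(std::array::from_fn(|i| self.0[i].wrapping_add(r.0[i])))
    }
}
```

Because callers only see `F32x16` and its operators, a native `__m512`-backed type with the same surface could be substituted without changing consumer code, which appears to be the point of keeping the API identical across the two files.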

simd.rs now re-exports from simd_avx2 (2× __m256 composed types) instead of simd_avx512 (__m512 native) for all 512-bit types. This eliminates the SIGILL risk on x86_64 without AVX-512. The AVX2 composed types use 2× F32x8 per F32x16 operation — correct on all hardware, 2 instructions instead of 1 on AVX-512. BLAS hot paths (dot, axpy, gemm) still dispatch to AVX-512 kernels via native.rs LazyLock<Tier> — no performance regression for inner loops. The simd.rs types serve HPC consumer code. LazyLock<Tier> detection added to simd.rs (same pattern as native.rs). F32x8/F64x4 (256-bit AVX2 base types) always re-exported from simd_avx512. 1422/1423 tests pass (1 pre-existing causal_diff failure). https://claude.ai/code/session_01BTATTRUACijvsK4hqmKUBR
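For readers unfamiliar with the `LazyLock<Tier>` pattern mentioned above, a minimal sketch of one-time runtime CPU detection follows. Only the name `Tier` comes from the commit message; the variants, the `detect` function, the `TIER` static, and `simd_width_f32` are assumptions made for illustration and may differ from what native.rs and simd.rs actually contain.

```rust
use std::sync::LazyLock;

/// Hypothetical capability tiers, ordered from least to most capable.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    Scalar,
    Avx2,
    Avx512,
}

/// Probe the CPU; on non-x86_64 targets this always reports `Scalar`.
fn detect() -> Tier {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx512f") {
            return Tier::Avx512;
        }
        if is_x86_feature_detected!("avx2") {
            return Tier::Avx2;
        }
    }
    Tier::Scalar
}

/// Initialised on first access, then cached for the life of the process.
pub static TIER: LazyLock<Tier> = LazyLock::new(detect);

/// Number of f32 lanes a single native instruction covers at each tier.
pub fn simd_width_f32(tier: Tier) -> usize {
    match tier {
        Tier::Avx512 => 16,
        Tier::Avx2 => 8,
        Tier::Scalar => 1,
    }
}
```

A hot path would read `*TIER` once per call and branch to the matching kernel, so the feature probe itself runs only on first use. Code gated this way never executes an AVX-512 instruction on a CPU that lacks it, which is how the SIGILL risk is avoided.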

https://claude.ai/code/session_01BTATTRUACijvsK4hqmKUBR

- Remove burn from exclude list; all crates now in workspace
- Add [lib] section to burn Cargo.toml (edition 2024 requires explicit target)
- p64: 23 tests pass, phyllotactic-manifold: 14 tests pass
- Full workspace compiles clean

https://claude.ai/code/session_01BTATTRUACijvsK4hqmKUBR

https://claude.ai/code/session_01BTATTRUACijvsK4hqmKUBR

7 sections, 11 tests, zero new types, only mappings:

1. SIMD manifold: expand_manifold_simd() via F64x8 + SPIRAL7_X/Y
2. SIMD attention: attend_batch_8() with VPOPCNTDQ fast path via simd_caps()
3. NARS bridge: resonance_to_nars(), nars_to_branch_byte()
4. CausalEdge64 compat: bit layout, palette addressing, layer mask mapping
5. ThinkingStyle cache: 6 styles in LazyLock, ordinal + name lookup
6. Semiring mapping: semiring name → CombineMode + ContraMode
7. DeepNSM palette: distance matrix → Palette64 interaction bitmap

Re-exports: Palette64, Palette3D, ThinkingStyle, HeelPlanes, CombineMode, ContraMode, predicate, manifold_consts

p64 + phyllotactic-manifold added as path deps in Cargo.toml.

https://claude.ai/code/session_01BTATTRUACijvsK4hqmKUBR

claude added 7 commits March 30, 2026 19:23

feat(simd): add Sub/Div for F32x8 + F64x4 (AVX2 arithmetic completeness)

961409c

F32x8 and F64x4 only had Add + Mul. AVX2 fallback for F32x16 needs all four arithmetic ops on the 256-bit types. Additive only — no existing code changed. https://claude.ai/code/session_01BTATTRUACijvsK4hqmKUBR

chore: update Cargo.lock

0c0dcb4

https://claude.ai/code/session_01BTATTRUACijvsK4hqmKUBR

chore: update Cargo.lock after workspace exclude removal

3bfa2f4

https://claude.ai/code/session_01BTATTRUACijvsK4hqmKUBR

AdaWorldAPI merged commit 9bd8cc7 into master Mar 30, 2026
5 of 14 checks passed

AdaWorldAPI merged 7 commits into master from claude/unified-query-planner-aW8ax
